Paraspinal muscle parameters’ predictive value for new vertebral compression fractures post-vertebral augmentation: Nomogram development and validation

Objective Prior research underscores the significance of paraspinal muscles in maintaining spinal stability. This study aims to investigate the predictive value of paraspinal muscle parameters for the occurrence of new vertebral compression fractures (NVCF) following percutaneous vertebroplasty (PVP) or percutaneous kyphoplasty (PKP) in patients with osteoporotic vertebral compression fractures (OVCF).
Methods We retrospectively collected data for patients with OVCF treated with PVP/PKP at our institution from October 2019 to February 2021 (internal validation, n = 235) and from March 2021 to November 2021 (external validation, n = 105). The internal validation cohort was randomly divided into training (188 cases) and validation (47 cases) groups at an 8:2 ratio. Lasso regression and multivariable logistic regression identified independent risk factors in the training set, and a Nomogram model was developed. Accuracy was assessed using receiver operating characteristic (ROC) curves, calibration was evaluated with calibration curves and the Hosmer-Lemeshow test, and clinical utility was analyzed using decision curve analysis (DCA) and the clinical impact curve (CIC).
Results Surgical approach, spinal computed tomography (CT) values, and multifidus skeletal muscle index (SMI) were independent predictors of postoperative NVCF in OVCF patients. A Nomogram model based on these predictors was developed and uploaded online. The area under the curve (AUC) was 0.801, 0.664, and 0.832 for the training set, validation set, and external validation set, respectively. Hosmer-Lemeshow goodness-of-fit tests (χ² = 7.311–14.474, p = 0.070–0.504) and calibration curves indicated good consistency between observed and predicted values. DCA and CIC demonstrated clinical net benefit within risk thresholds of 0.06–0.84, 0.12–0.23, and 0.01–0.27 in the three sets, respectively. At specificity 1.00–0.80, the partial AUC (0.106) exceeded that at sensitivity 1.00–0.80 (0.062).
Conclusion Compared to the spinal CT value, the multifidus SMI has certain potential in predicting the occurrence of NVCF. Additionally, the Nomogram model of this study has a greater negative predictive value.


Introduction
Osteoporosis, defined by the World Health Organization as a bone mineral density (BMD) T-score < −2.5 measured by dual-energy X-ray absorptiometry (DXA), is a prevalent condition affecting 30% of women and 12% of men (1). Osteoporosis signifies compromised bone mass and reduced strength, thereby elevating the risk of complications such as osteoporotic vertebral compression fractures (OVCF) and spinal deformities. OVCF is the most common osteoporotic fracture globally, with an incidence of approximately 30-50% in individuals aged 50 and above (2). Primary treatments for OVCF include percutaneous vertebroplasty (PVP) and percutaneous kyphoplasty (PKP). There is ongoing debate about which technique is superior. Some studies indicate that PKP can significantly increase vertebral height after surgery and reduce the risk of cement leakage (3). Despite the maturity of these minimally invasive techniques, postoperative complications such as new vertebral compression fractures (NVCF) warrant attention, with reported incidence ranging from 2 to 52% (4,5). Existing studies indicate age, gender, body mass index (BMI), BMD, and cement leakage as potential risk factors for NVCF (6,7).
Given the spine's role as a complex, multi-joint structure, its stability is crucial for maintaining normal posture control and trunk movement. Panjabi (8) proposed that spinal stability is determined by the complex interaction of three systems: the passive subsystem (vertebrae, intervertebral discs, facet joints, spinal ligaments), the active subsystem (paraspinal muscles), and the neural control subsystem. Dysfunction in any component of this system can alter stability, leading to pathological decompensation. Osteoporosis and sarcopenia, prevalent musculoskeletal disorders in the aging population, often coexist. With the global aging trend, their incidence is expected to rise (9). In sarcopenia, degeneration of the paraspinal muscles (fat infiltration, fibrosis, and atrophy) reduces spinal stability, increasing the risk of OVCF (10). Concurrently, existing research confirms magnetic resonance imaging (MRI) as a gold standard for assessing overall and local skeletal muscle mass, subcutaneous adipose tissue, and visceral adipose tissue. In OVCF patients, MRI not only reveals vertebral morphological changes but also clearly displays post-fracture traumatic bone marrow edema (11). This offers an opportunity for assessing skeletal muscle mass in such patients. Several studies have identified a correlation between MRI-measured paraspinal muscle fat infiltration and NVCF (11-13). Fat infiltration of muscle reduces the amount of functional muscle, inevitably affecting spinal stability. Therefore, this study aims to investigate the predictive value of MRI-assessed paraspinal muscle parameters for NVCF following PVP/PKP and to develop and validate a corresponding Nomogram. This study strictly adheres to the guidelines of the "Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD)," with Supplementary materials providing detailed information.

Paraspinal muscle
Participants underwent preoperative examination using a 3.0 T MRI system, obtaining supine T2-weighted axial images. The maximum cross-sectional area (CSA) of the paraspinal muscles is typically located between the L3/L4 and L4/L5 disc levels based on anatomical features, while the largest CSA of the psoas major muscle is identified at the L4/L5 disc level (14). Therefore, muscle measurements in this study focused on the paraspinal muscles at the L4/5 disc level. Using the sample function in RStudio, 30 patients were randomly selected without replacement. Their T2-weighted axial images were imported into ImageJ software for measurement by two radiologists blinded to the patients' outcomes (Reader 1, with 15 years of experience in MRI diagnosis; Reader 2, with 8 years of experience). The preoperative CSA of the bilateral multifidus/erector spinae/psoas major muscles and the fat CSA of these muscles were measured (Figure 1). After 1 month, Reader 2 performed repeated measurements for the same 30 patients. Intra- and inter-observer correlation coefficients ≥ 0.75 were taken to indicate good consistency both between the two readers and within the same reader. Subsequently, all paraspinal muscle data were independently measured by Reader 2. Functional cross-sectional area (FCSA) was obtained by calculating the difference between paraspinal muscle CSA and paraspinal muscle fat CSA. To standardize for variations in patient height, the skeletal muscle index (SMI) for paraspinal muscles was calculated as paraspinal muscle SMI = paraspinal muscle FCSA (mm²) ÷ height² (m²). The average SMI of the paraspinal muscles on both sides was used as the paraspinal muscle SMI.
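The FCSA and SMI definitions above can be illustrated with a minimal Python sketch. The function names and the numeric inputs are hypothetical and chosen only for illustration; they are not measurements from this study.

```python
def functional_csa(muscle_csa_mm2: float, fat_csa_mm2: float) -> float:
    """FCSA = muscle CSA minus the fat CSA within that muscle (both in mm^2)."""
    return muscle_csa_mm2 - fat_csa_mm2


def skeletal_muscle_index(fcsa_mm2: float, height_m: float) -> float:
    """SMI = FCSA (mm^2) divided by height squared (m^2)."""
    return fcsa_mm2 / height_m ** 2


def bilateral_smi(left, right, height_m: float) -> float:
    """Average the SMI of the left and right muscle.

    `left` and `right` are (muscle CSA, fat CSA) pairs in mm^2.
    """
    smis = [
        skeletal_muscle_index(functional_csa(csa, fat), height_m)
        for csa, fat in (left, right)
    ]
    return sum(smis) / len(smis)


# Hypothetical multifidus measurements for a patient 1.60 m tall.
smi = bilateral_smi(left=(420.0, 110.0), right=(400.0, 90.0), height_m=1.60)
print(f"Multifidus SMI: {smi:.2f} mm^2/m^2")
```

Both sides have an FCSA of 310 mm² in this example, so the bilateral average equals the single-side SMI.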

Fractured vertebral characteristics
Reader 2 measured, from lateral X-ray images, OVCF segment (thoracic segment < T11, thoracolumbar segment T11-L2, lumbar segment > L2), fracture shape (wedge, biconcave, crush), fracture compression percentage, pre-/postoperative anterior vertebral height (AVH), preoperative average AVH of the vertebrae adjacent above and below the compressed vertebra, pre-/postoperative Cobb angles (formed by the upper and lower endplates of the compressed vertebra), and cement leakage. Using these measurements, additional calculations were performed, including the anterior vertebral height ratio (AVHR) = compressed vertebra AVH ÷ average AVH of adjacent upper and lower vertebrae × 100%, anterior vertebral height recovery ratio (AVHRR) = postoperative AVHR − preoperative AVHR, and Cobb angle change = preoperative Cobb angle − postoperative Cobb angle. Postoperative lateral X-ray images of all patients were obtained within 3 days after the surgery.
Using MRI and three-dimensional reconstruction of spinal CT, we measured intravertebral clefts (IVC) and spinal CT values. IVC were identified on CT as gas density within the vertebral body, displaying a characteristic "double-line sign" on MRI with T1-weighted low signal and T2-weighted high signal. Spinal CT values were determined in the vertebral body adjacent to the fractured vertebra. The vertebral body was evenly divided into three sections in the sagittal plane, and the average CT value was calculated in the maximum elliptical region of interest containing only trabecular bone (Supplementary Figure S1).
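The vertebral height and Cobb angle indices defined in this section reduce to simple arithmetic. The sketch below is illustrative only; the function names and all input values (heights in mm, angles in degrees) are hypothetical.

```python
def avhr(compressed_avh: float, upper_avh: float, lower_avh: float) -> float:
    """Anterior vertebral height ratio (%): compressed AVH over the mean
    AVH of the adjacent upper and lower vertebrae."""
    return compressed_avh / ((upper_avh + lower_avh) / 2) * 100


def avhrr(pre_avhr: float, post_avhr: float) -> float:
    """Anterior vertebral height recovery ratio: postoperative minus preoperative AVHR."""
    return post_avhr - pre_avhr


def cobb_change(pre_cobb: float, post_cobb: float) -> float:
    """Cobb angle change: preoperative minus postoperative Cobb angle."""
    return pre_cobb - post_cobb


# Hypothetical patient: AVH 15 mm before and 20 mm after surgery,
# adjacent vertebrae 24 mm and 26 mm, Cobb angle 22 degrees -> 14 degrees.
pre = avhr(15.0, 24.0, 26.0)
post = avhr(20.0, 24.0, 26.0)
recovery = avhrr(pre, post)
delta_cobb = cobb_change(22.0, 14.0)
print(f"AVHR {pre:.1f}% -> {post:.1f}%, AVHRR {recovery:.1f}, Cobb change {delta_cobb:.1f}")
```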

Imaging assessment method
Candidate risk factors included general characteristics such as age, gender, education level, occupation, weight, BMI, diabetes, hypertension, smoking, drinking, preoperative Visual Analog Scale (VAS) score, postoperative VAS score, and time to first ambulation post-surgery. Treatment measures included anti-osteoporosis treatment, surgical approach, puncture pathway, volume of injected bone cement, and duration of surgery. Vertebral features comprised multiple vertebral fractures, fracture segment, fracture compression percentage, fracture shape, IVC, spinal CT values, AVHRR, Cobb angle changes, and cement leakage. Laboratory examinations involved leukocyte count, hemoglobin, urea, creatinine, and C-reactive protein/albumin ratio (CAR). Parameters of paraspinal muscles included multifidus/erector spinae/psoas major SMIs. A 2-year follow-up utilized outpatient visits, electronic medical record tracking, and telephone follow-ups. Anti-osteoporosis treatment referred to regular use of bisphosphonates during the follow-up period.

Statistical methods
All statistical analyses were conducted using RStudio software (v4.2.3, http://www.rproject.org/). Bonferroni correction was applied using the "agricolae" package. Intra- and inter-observer correlation coefficients (ICC) were assessed using the "irr" package, with results ≥ 0.75 considered indicative of good consistency. Predictive factors for NVCF occurrence were selected through Least Absolute Shrinkage and Selection Operator (LASSO) regression using the "glmnet" package. Logistic regression was performed using the "rms" package. The Nomogram was plotted using the "nomogramFormula" package. Receiver operating characteristic (ROC) curves and partial ROC (pROC) curves were generated using the "pROC" package. The "epiR" package was utilized for confusion matrix analysis to evaluate model predictive performance. Calibration curves and the Hosmer-Lemeshow goodness-of-fit test were produced using the "rms" package. Decision Curve Analysis (DCA) and Clinical Impact Curve (CIC) were plotted using the "rmda" package. Statistical significance was set at p < 0.05. Additional details are available in Supplementary information S3.
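As background on the AUC used to assess accuracy, the sketch below computes it in its rank-based (Mann-Whitney) form: the probability that a randomly chosen NVCF case receives a higher predicted risk than a randomly chosen non-NVCF case, with ties counted as one half. This is a plain-Python illustration of the concept, not the "pROC" implementation used in the study, and the labels and scores are made up.

```python
def auc(labels, scores):
    """Rank-based AUC: fraction of (case, non-case) pairs in which the case
    has the higher score; tied pairs contribute 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case and one non-case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical outcomes (1 = NVCF) and predicted risks.
labels = [0, 0, 1, 0, 1, 1]
scores = [0.10, 0.30, 0.35, 0.40, 0.80, 0.80]
example_auc = auc(labels, scores)
print(f"AUC = {example_auc:.3f}")
```

Here eight of the nine case/non-case pairs are ordered correctly, giving an AUC of 8/9.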

General results
The internal validation set included 235 patients, comprising 53 males and 182 females, with an age range of 62-97 years and a mean age of 75.04 ± 7.97 years. The comparison between NVCF and No-NVCF patients is shown in Supplementary Table S1; there was no significant difference in the occurrence of NVCF between patients in the PVP group and those in the PKP group (p = 0.065). Among these patients, 37 experienced NVCF, resulting in a recurrence rate of 15.74%. The external validation set comprised 105 patients, with 30 males and 75 females, aged 60-93 years, and a mean age of 75.60 ± 8.59 years. NVCF occurred in 9 cases, yielding a recurrence rate of 8.57%. The intra- and inter-observer correlation coefficients were between 0.829 and 0.989, indicating good consistency. Utilizing the "createDataPartition" function in the "caret" package in R, the internal validation set was randomly allocated into training (n = 188) and validation (n = 47) sets at an 8:2 ratio (Table 1). The detailed process of internal validation dataset partitioning is presented in Supplementary information S4, Supplementary Table S4, and Supplementary Figure S2.
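The recurrence rates and the 8:2 allocation can be checked with a short sketch. The split below is a generic outcome-stratified random split written in plain Python; it is an assumption-laden stand-in and does not reproduce caret's "createDataPartition" rounding rules or the authors' actual partition. The seed and function name are hypothetical.

```python
import random


def stratified_split(outcomes, train_fraction=0.8, seed=0):
    """Split indices into train/test while keeping the outcome ratio similar.

    Each outcome class is shuffled and split separately (a generic approach,
    not caret's exact algorithm).
    """
    rng = random.Random(seed)
    train, test = [], []
    for label in sorted(set(outcomes)):
        idx = [i for i, y in enumerate(outcomes) if y == label]
        rng.shuffle(idx)
        n_train = round(len(idx) * train_fraction)
        train.extend(idx[:n_train])
        test.extend(idx[n_train:])
    return sorted(train), sorted(test)


# Cohort sizes reported in the text: 37 NVCF among 235, 9 NVCF among 105.
internal_rate = round(100 * 37 / 235, 2)
external_rate = round(100 * 9 / 105, 2)

outcomes = [1] * 37 + [0] * 198  # 1 = NVCF, 0 = no NVCF
train_idx, test_idx = stratified_split(outcomes)
print(internal_rate, external_rate, len(train_idx), len(test_idx))
```

With these class sizes the sketch also yields 188 training and 47 validation patients, although the number of NVCF cases per subset in the actual study may differ.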

Differences in paraspinal muscle SMI by gender and NVCF group and its correlation with other predictive factors
Grouped by the occurrence of NVCF, the paraspinal muscle SMI in the internal validation set showed no statistical differences (p = 0.212-0.714), as detailed in Supplementary Table S1. Upon grouping the internal validation set by gender, the multifidus SMI was 145.42 ± 60.76 for males and 143.49 ± 60.54 for females (p = 0.937). The erector spinae SMI was 302.10 ± 111.08 for males and 333.16 ± 124.53 for females (p = 0.085). Additionally, the lumbar erector spinae SMI was 285.30 ± 84.75 for males and 292.14 ± 93.93 for females (p = 0.765).
In the correlation analysis, the multifidus SMI was positively correlated with hemoglobin (R = 0.150) in all patients (235) of the internal validation set, the erector spinae SMI was positively correlated with hemoglobin (R = 0.212), urea (R = 0.132) and creatinine (R = 0.162), and the psoas major SMI was positively correlated with hemoglobin (R = 0.269) and creatinine (R = 0.247).In the NVCF patients (37) of the internal validation set, the multifidus SMI was negatively correlated with vertebral compression percentage (R = -0.331), the erector spinae SMI was positively correlated with hemoglobin (R = 0.466), and the psoas major SMI was positively correlated with creatinine (R = 0.362) and spinal CT value (R = 0.336) (Supplementary Table S3).

Nomogram model development
To avoid overfitting, each feature required at least 10-15 patients for model development (15,16). With 188 patients in the training set, the maximum feature limit was set at 18. LASSO regression was employed for variable selection. The model performed well with five predictors at lambda = 0.038 (Figure 2). Body weight, surgical approach, duration of surgery, spinal CT values, and multifidus SMI were included in the logistic regression model, revealing that surgical approach, spinal CT values, and multifidus SMI (p < 0.05) were independent predictors for post-vertebroplasty NVCF (Table 2). A Nomogram model was developed using these three predictors (Figure 3) and made available at https://sofarnomogram.shinyapps.io/NVCFnomogram/.
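The feature limit follows from integer division of the training set size by the required patients per feature. The sketch below works through that arithmetic under the reading stated in the text (patients per feature); the function name is hypothetical.

```python
def max_features(n_patients: int, patients_per_feature: int) -> int:
    """Largest number of candidate features allowed by a patients-per-feature rule."""
    return n_patients // patients_per_feature


lenient_limit = max_features(188, 10)  # 10 patients per feature
strict_limit = max_features(188, 15)   # 15 patients per feature
print(lenient_limit, strict_limit)
```

The reported limit of 18 corresponds to the more lenient end of the rule; the five predictors retained by LASSO fall below either bound.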

Nomogram model's clinical utility
Results from DCA curves in the training, testing, and validation sets indicated maximal clinical net benefit at risk thresholds of 0.06-0.84, 0.12-0.23, and 0.01-0.27, respectively. The predictive model demonstrated significant additional clinical net benefit in identifying low-risk cases of NVCF within these ranges (Figure 4). Risk stratification for 1,000 predictions using CIC revealed consistently higher predicted cases of NVCF compared to actual occurrences within the threshold probability range (Figure 4). The pAUC SP (0.106) at specificity 1.00-0.80 surpassed the pAUC SE (0.062) at sensitivity 1.00-0.80, with a Bootstrap p-value (5,000 repetitions) of 0.044 (Figure 5).
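For readers unfamiliar with DCA, the quantity plotted at each risk threshold is the net benefit: true positives per patient minus false positives per patient weighted by the odds of the threshold. The sketch below uses the standard formula in plain Python with made-up data; it is not the "rmda" implementation, and treating a predicted risk equal to the threshold as positive is an assumption.

```python
def net_benefit(labels, probs, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold."""
    n = len(labels)
    tp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(labels, threshold):
    """Reference strategy: treat everyone regardless of predicted risk."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical outcomes (1 = NVCF) and predicted risks for ten patients.
labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
probs = [0.60, 0.30, 0.40, 0.10, 0.10, 0.05, 0.20, 0.10, 0.05, 0.10]

nb_model = net_benefit(labels, probs, threshold=0.25)
nb_all = net_benefit_treat_all(labels, threshold=0.25)
print(f"model: {nb_model:.3f}, treat all: {nb_all:.3f}, treat none: 0.000")
```

A model is useful at a given threshold when its net benefit exceeds both the treat-all and treat-none (zero) strategies, which is how the threshold ranges above are read off the curves.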

Discussion
In our study, we found that when the training set and testing set were divided in an 8:2 ratio, multifidus SMI became a protective factor for NVCF occurrence. Compared with the model obtained from a 6:4 division (spinal CT value + surgical approach), the DeLong test p-value was 0.215, and there was no significant difference in pAUC between the two models at specificity and sensitivity of 1-0.75 and 1-0.80 (p = 0.644-0.783). To explore the impact of paraspinal muscle SMI on NVCF occurrence, we chose the model with the 8:2 ratio division. Previous studies have shown that gender, BMI, duration of surgery, bone cement leakage, and IVC are predictive factors for NVCF, so we performed subgroup analysis, and the results showed that the model was not affected by these factors (AUC: 0.703-0.837; p = 0.232-0.977). At the same time, we also found that postoperative time to first ambulation (≤ 3 days) did not affect the model.
Osteoporosis, characterized by bone loss and disrupted microstructure, renders bones fragile and prone to low-energy trauma. The spine and hips are common sites for osteoporotic fractures, affecting about half of postmenopausal women (17). Notably, vertebral fractures escalate the risk of NVCF by 4-7 times, with increasing risk correlating with the number of vertebral fractures (18). In our study, OVCF patients treated with PVP/PKP had a 2-year postoperative NVCF probability of 15.74% (37/235), within the reported range of 2-52%. According to the Nomogram results, spinal CT value is the most significant factor influencing NVCF occurrence. Interestingly, as the number of cases increased, paraspinal muscle SMI emerged as a protective factor against NVCF, although the DeLong test showed that its effect on model discrimination was not statistically significant. This contrasts with previous findings on the significant influence of paraspinal muscle parameters in NVCF occurrence (19-21). We assert that paraspinal muscle SMI, particularly multifidus SMI, holds predictive potential for NVCF.
Previous studies commonly employed BMD to assess osteoporosis severity, with DXA (a 2D planar projection technique) being the most widely used clinical method (22). However, DXA has limitations such as low resolution and an inability to directly image bone microstructure (e.g., cortical bone and trabeculae) (23). Its accuracy can be compromised, with error rates reaching up to 20% in patients with spinal deformities, degenerative diseases, or calcification, due to alterations in spinal structure (24). Thus, there is a need to explore novel methods for measuring BMD. In recent years, quantitative CT has garnered widespread attention for measuring spinal BMD. The Hounsfield Unit (HU) in CT provides a method for assessing local BMD, with studies demonstrating its correlation with DXA results (25). Research indicates that HU values measured by CT exhibit higher sensitivity and specificity (26). Additionally, according to Yang et al. (27), a low T-score may not necessarily be an independent risk factor for NVCF. This could be attributed to the prevalence of severe degenerative changes and compensatory osteophyte formation in the vertebrae of elderly patients with OVCF, potentially affecting the accuracy of T-scores. In contrast, spinal CT values, with their higher resolution, offer a better reflection of overall vertebral bone quality. Consequently, our study employed spinal CT values for BMD assessment and identified vertebral CT values as an independent protective factor for postoperative NVCF, aligning with the findings of Bian et al. (28).
As mentioned in the introduction, the spine, as a functional unit, exhibits close interactions between its skeletal structure and the paraspinal muscle system. The maximum mechanical load borne by the skeleton primarily results from the dynamic contraction activities of muscles. This dynamic stimulation can positively promote bone growth and remodeling by increasing periosteal tension caused by muscle mass and the traction effect of collagen fibers, thereby regulating bone density, enhancing bone strength, and affecting bone microstructure (9,29). Therefore, a decline in paraspinal muscle function will increase the risk of osteoporosis. Jeon et al. (30) revealed paraspinal muscle fat infiltration as a risk factor for vertebral collapse in OVCF patients. Additionally, Cheng et al. (21) found that post-PKP paraspinal muscle SMI at the L4 level is an independent protective factor against NVCF (OR 0.830). To account for varying patient heights, we standardized muscle FCSA across different individuals, as muscle volume typically correlates with patient height. Degenerative changes in the vertebrae are prevalent among the elderly and result in diffuse hypertrophy and an increase in vertebral CSA, which could compromise the accuracy of the muscle-vertebra index. Therefore, we employed SMI as a parameter for assessing muscle mass. Considering the functional variations among different paraspinal muscles, we analyzed the impact of each paraspinal muscle SMI on NVCF. Our analysis revealed that multifidus SMI is an independent protective factor against NVCF after PVP/PKP. This is consistent with Lee et al. (31), who found no difference in average CSA of the multifidus among the OVCF with osteoporosis, OVCF with bone loss, and non-OVCF groups, whereas multifidus FCSA in the OVCF group was significantly lower than in the non-OVCF group. Logistic regression analysis indicated that multifidus FCSA at L4-5

In this study, the surgical approach emerged as one of the predictive factors for NVCF. Both PKP and PVP are minimally invasive procedures widely utilized in treating OVCF, proven to be safe and effective. Their fundamental mechanism involves injecting bone cement into the fractured vertebra, restoring mechanical stability, enhancing strength, and alleviating pain symptoms (37). However, the assessment of postoperative efficacy between PVP and PKP remains contentious in clinical practice. Griffoni et al. (38) conducted a long-term follow-up study comparing the effectiveness and safety of PKP and PVP in treating OVCF. The results showed that while both procedures effectively restored vertebral height and improved spinal kyphosis, the risk of adjacent-segment NVCF was significantly higher in the PVP group than in the PKP group. Furthermore, Zhu et al. (39) demonstrated that, compared with PVP and conservative treatment, PKP yielded superior improvements in quality of life and reduced the risk of postoperative NVCF in OVCF patients. Thus, PKP is considered the ideal choice for OVCF treatment, with a significantly lower incidence of adjacent vertebral fractures post-PKP. Consistent with these findings, our Nomogram results demonstrate a reduced probability of NVCF occurrence in patients undergoing PKP treatment.

This study has several limitations. Firstly, this was a single-center retrospective study, although it demonstrated good consistency within and between groups and was validated externally. When the training set was split 8:2, Hosmer-Lemeshow goodness-of-fit tests for the training, testing, and validation sets all indicated acceptable model fits (all p > 0.05). However, as the training set proportion increased, a decreasing trend in the p-values of the Hosmer-Lemeshow test was observed. Thus, prospective large-scale clinical cohort studies are still needed for validation. Secondly, only three predictive variables were included, excluding many other factors such as physical activity. Future research should address these limitations and explore the impact of additional relevant factors.

This retrospective study selected OVCF patients treated with PVP/PKP in our hospital from October 2019 to November 2021. The internal validation set comprises cases from October 2019 to February 2021, and the external validation set includes cases from March 2021 to November 2021. After PVP/PKP, NVCF was diagnosed based on recurring back pain, especially during movement, and confirmed by new wedge changes on X-rays or MRI (Supplementary Figure S1). Inclusion criteria: (1) Age ≥ 60 years; (2) Preoperative diagnostic support for OVCF through X-ray, three-dimensional reconstruction of spinal computed tomography (CT), and MRI; (3) Underwent vertebroplasty for the first time; (4) DXA-measured BMD T-score < −2.5. Exclusion criteria: (1) Symptomatic pain due to other causes such as disc herniation or spondylolisthesis; (2) Fractures from high-energy trauma or tumor-related fractures; (3) History of previous lumbar spine surgery or clear trauma; (4) Neuro-musculoskeletal or endocrine disorders affecting paraspinal muscle function, or long-term use of steroid medications causing skeletal metabolism abnormalities; (5) Incomplete clinical data; and (6) Coexisting infections, severe cardiovascular or cerebrovascular diseases, or other congenital conditions.

FIGURE 1
Paraspinal muscles at the L4/5 level. (A) T2-weighted image illustrating the multifidus muscle (MF), erector spinae muscle (ES), and psoas muscle (PS); (B) ImageJ analysis depicting the cross-sectional area of paraspinal muscles and the extent of fatty infiltration.

FIGURE 2
(A) Characterizing the variations of LASSO regression coefficients. (B) LASSO regression selects the optimal parameter lambda through cross-validation. The dashed line on the right represents lambda values with average error within ±1 standard deviation, indicating improved model performance.

FIGURE 3
Developed Nomogram based on multifactorial logistic regression analysis.

TABLE 1
Comparison of characteristics among the training, testing, and validation sets.

TABLE 2
Logistic regression analysis of risk factors for NVCF occurrence in the training set patients.

TABLE 3
Diagnostic performance of the Nomogram model in the training, testing, and validation sets. AUC, area under the receiver operating characteristic curve; PPV, positive predictive value; NPV, negative predictive value.